16 research outputs found

    Near-optimal irrevocable sample selection for periodic data streams with applications to marine robotics

    Full text link
    We consider the task of monitoring spatiotemporal phenomena in real-time by deploying limited sampling resources at locations of interest irrevocably and without knowledge of future observations. This task can be modeled as an instance of the classical secretary problem. Although this problem has been studied extensively in theoretical domains, existing algorithms require that data arrive in random order to provide performance guarantees. These algorithms will perform arbitrarily poorly on data streams such as those encountered in robotics and environmental monitoring domains, which tend to have spatiotemporal structure. We focus on the problem of selecting representative samples from phenomena with periodic structure and introduce a novel sample selection algorithm that recovers a near-optimal sample set according to any monotone submodular utility function. We evaluate our algorithm on a seven-year environmental dataset collected at the Martha's Vineyard Coastal Observatory and show that it selects phytoplankton sample locations that are nearly optimal in an information-theoretic sense for predicting phytoplankton concentrations in locations that were not directly sampled. The proposed periodic secretary algorithm can be used with theoretical performance guarantees in many real-time sensing and robotics applications for streaming, irrevocable sample selection from periodic data streams.Comment: 8 pages, accepted for presentation in IEEE Int. Conf. on Robotics and Automation, ICRA '18, Brisbane, Australia, May 201

    Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models

    Full text link
    The gap between our ability to collect interesting data and our ability to analyze these data is growing at an unprecedented rate. Recent algorithmic attempts to fill this gap have employed unsupervised tools to discover structure in data. Some of the most successful approaches have used probabilistic models to uncover latent thematic structure in discrete data. Despite the success of these models on textual data, they have not generalized as well to image data, in part because of the spatial and temporal structure that may exist in an image stream. We introduce a novel unsupervised machine learning framework that incorporates the ability of convolutional autoencoders to discover features from images that directly encode spatial information, within a Bayesian nonparametric topic model that discovers meaningful latent patterns within discrete data. By using this hybrid framework, we overcome the fundamental dependency of traditional topic models on rigidly hand-coded data representations, while simultaneously encoding spatial dependency in our topics without adding model complexity. We apply this model to the motivating application of high-level scene understanding and mission summarization for exploratory marine robots. Our experiments on a seafloor dataset collected by a marine robot show that the proposed hybrid framework outperforms current state-of-the-art approaches on the task of unsupervised seafloor terrain characterization.Comment: 8 page

    Balancing exploration and exploitation: task-targeted exploration for scientific decision-making

    Get PDF
    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution September 2022.How do we collect observational data that reveal fundamental properties of scientific phenomena? This is a key challenge in modern scientific discovery. Scientific phenomena are complex—they have high-dimensional and continuous state, exhibit chaotic dynamics, and generate noisy sensor observations. Additionally, scientific experimentation often requires significant time, money, and human effort. In the face of these challenges, we propose to leverage autonomous decision-making to augment and accelerate human scientific discovery. Autonomous decision-making in scientific domains faces an important and classical challenge: balancing exploration and exploitation when making decisions under uncertainty. This thesis argues that efficient decision-making in real-world, scientific domains requires task-targeted exploration—exploration strategies that are tuned to a specific task. By quantifying the change in task performance due to exploratory actions, we enable decision-makers that can contend with highly uncertain real-world environments, performing exploration parsimoniously to improve task performance. The thesis presents three novel paradigms for task-targeted exploration that are motivated by and applied to real-world scientific problems. We first consider exploration in partially observable Markov decision processes (POMDPs) and present two novel planners that leverage task-driven information measures to balance exploration and exploitation. These planners drive robots in simulation and oceanographic field trials to robustly identify plume sources and track targets with stochastic dynamics. We next consider the exploration- exploitation trade-off in online learning paradigms, a robust alternative to POMDPs when the environment is adversarial or difficult to model. We present novel online learning algorithms that balance exploitative and exploratory plays optimally under real-world constraints, including delayed feedback, partial predictability, and short regret horizons. We use these algorithms to perform model selection for subseasonal temperature and precipitation forecasting, achieving state-of-the-art forecasting accuracy. The human scientific endeavor is poised to benefit from our emerging capacity to integrate observational data into the process of model development and validation. Realizing the full potential of these data requires autonomous decision-makers that can contend with the inherent uncertainty of real-world scientific domains. This thesis highlights the critical role that task-targeted exploration plays in efficient scientific decision-making and proposes three novel methods to achieve task-targeted exploration in real-world oceanographic and climate science applications.This material is based upon work supported by the NSF Graduate Research Fellowship Program and a Microsoft Research PhD Fellowship, as well as the Department of Energy / National Nuclear Security Administration under Award Number DE-NA0003921, the Office of Naval Research under Award Number N00014-17-1-2072, and DARPA under Award Number HR001120C0033

    Statistical models and decision making for robotic scientific information gathering

    Get PDF
    Submitted in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution September 2018.Mobile robots and autonomous sensors have seen increasing use in scientific applications, from planetary rovers surveying for signs of life on Mars, to environmental buoys measuring and logging oceanographic conditions in coastal regions. This thesis makes contributions in both planning algorithms and model design for autonomous scientific information gathering, demonstrating how theory from machine learning, decision theory, theory of optimal experimental design, and statistical inference can be used to develop online algorithms for robotic information gathering that are robust to modeling errors, account for spatiotemporal structure in scientific data, and have probabilistic performance guarantees. This thesis first introduces a novel sample selection algorithm for online, irrevocable sampling in data streams that have spatiotemporal structure, such as those that commonly arise in robotics and environmental monitoring. Given a limited sampling capacity, the proposed periodic secretary algorithm uses an information-theoretic reward function to select samples in real-time that maximally reduce posterior uncertainty in a given scientific model. Additionally, we provide a lower bound on the quality of samples selected by the periodic secretary algorithm by leveraging the submodularity of the information-theoretic reward function. Finally, we demonstrate the robustness of the proposed approach by employing the periodic secretary algorithm to select samples irrevocably from a seven-year oceanographic data stream collected at the Martha’s Vineyard Coastal Observatory off the coast of Cape Cod, USA. Secondly, we consider how scientific models can be specified in environments – such as the deep sea or deep space – where domain scientists may not have enough a priori knowledge to formulate a formal scientific model and hypothesis. These domains require scientific models that start with very little prior information and construct a model of the environment online as observations are gathered. We propose unsupervised machine learning as a technique for science model-learning in these environments. To this end, we introduce a hybrid Bayesian-deep learning model that learns a nonparametric topic model of a visual environment. We use this semantic visual model to identify observations that are poorly explained in the current model, and show experimentally that these highly perplexing observations often correspond to scientifically interesting phenomena. On a marine dataset collected by the SeaBED AUV on the Hannibal Sea Mount, images of high perplexity in the learned model corresponded, for example, to a scientifically novel crab congregation in the deep sea. The approaches presented in this thesis capture the depth and breadth of the problems facing the field of autonomous science. Developing robust autonomous systems that enhance our ability to perform exploratory science in environments such as the oceans, deep space, agricultural and disaster-relief zones will require insight and techniques from classical areas of robotics, such as motion and path planning, mapping, and localization, and from other domains, including machine learning, spatial statistics, optimization, and theory of experimental design. This thesis demonstrates how theory and practice from these diverse disciplines can be unified to address problems in autonomous scientific information gathering

    Statistical models and decision making for robotic scientific information gathering

    Get PDF
    Thesis: S.M., Joint Program in Applied Ocean Physics and Engineering (Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science; and the Woods Hole Oceanographic Institution), 2018.This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.Cataloged from student-submitted PDF version of thesis.Includes bibliographical references (pages 97-107).Mobile robots and autonomous sensors have seen increasing use in scientific applications, from planetary rovers surveying for signs of life on Mars, to environmental buoys measuring and logging oceanographic conditions in coastal regions. This thesis makes contributions in both planning algorithms and model design for autonomous scientific information gathering, demonstrating how theory from machine learning, decision theory, theory of optimal experimental design, and statistical inference can be used to develop online algorithms for robotic information gathering that are robust to modeling errors, account for spatiotemporal structure in scientific data, and have probabilistic performance guarantees. This thesis first introduces a novel sample selection algorithm for online, irrevocable sampling in data streams that have spatiotemporal structure, such as those that commonly arise in robotics and environmental monitoring. Given a limited sampling capacity, the proposed periodic secretary algorithm uses an information-theoretic reward function to select samples in real-time that maximally reduce posterior uncertainty in a given scientific model. Additionally, we provide a lower bound on the quality of samples selected by the periodic secretary algorithm by leveraging the submodularity of the information-theoretic reward function. Finally, we demonstrate the robustness of the proposed approach by employing the periodic secretary algorithm to select samples irrevocably from a seven-year oceanographic data stream collected at the Martha's Vineyard Coastal Observatory off the coast of Cape Cod, USA. Secondly, we consider how scientific models can be specified in environments - such as the deep sea or deep space - where domain scientists may not have enough a priori knowledge to formulate a formal scientific model and hypothesis. These domains require scientific models that start with very little prior information and construct a model of the environment online as observations are gathered. We propose unsupervised machine learning as a technique for science model-learning in these environments. To this end, we introduce a hybrid Bayesian-deep learning model that learns a nonparametric topic model of a visual environment. We use this semantic visual model to identify observations that are poorly explained in the current model, and show experimentally that these highly perplexing observations often correspond to scientifically interesting phenomena. On a marine dataset collected by the SeaBED AUV on the Hannibal Sea Mount, images of high perplexity in the learned model corresponded, for example, to a scientifically novel crab congregation in the deep sea. The approaches presented in this thesis capture the depth and breadth of the problems facing the field of autonomous science. Developing robust autonomous systems that enhance our ability to perform exploratory science in environments such as the oceans, deep space, agricultural and disaster-relief zones will require insight and techniques from classical areas of robotics, such as motion and path planning, mapping, and localization, and from other domains, including machine learning, spatial statistics, optimization, and theory of experimental design. This thesis demonstrates how theory and practice from these diverse disciplines can be unified to address problems in autonomous scientific information gathering.by Genevieve Elaine Flaspohler.S.M

    Adaptive Bias Correction for Improved Subseasonal Forecasting

    Full text link
    Subseasonal forecasting \unicode{x2013} predicting temperature and precipitation 2 to 6 weeks \unicode{x2013} ahead is critical for effective water allocation, wildfire management, and drought and flood mitigation. Recent international research efforts have advanced the subseasonal capabilities of operational dynamical models, yet temperature and precipitation prediction skills remains poor, partly due to stubborn errors in representing atmospheric dynamics and physics inside dynamical models. To counter these errors, we introduce an adaptive bias correction (ABC) method that combines state-of-the-art dynamical forecasts with observations using machine learning. When applied to the leading subseasonal model from the European Centre for Medium-Range Weather Forecasts (ECMWF), ABC improves temperature forecasting skill by 60-90% and precipitation forecasting skill by 40-69% in the contiguous U.S. We couple these performance improvements with a practical workflow, based on Cohort Shapley, for explaining ABC skill gains and identifying higher-skill windows of opportunity based on specific climate conditions.Comment: 16 pages of main paper and 2 pages of appendix tex

    Streaming Scene Maps for Co-Robotic Exploration in Bandwidth Limited Environments

    Full text link
    This paper proposes a bandwidth tunable technique for real-time probabilistic scene modeling and mapping to enable co-robotic exploration in communication constrained environments such as the deep sea. The parameters of the system enable the user to characterize the scene complexity represented by the map, which in turn determines the bandwidth requirements. The approach is demonstrated using an underwater robot that learns an unsupervised scene model of the environment and then uses this scene model to communicate the spatial distribution of various high-level semantic scene constructs to a human operator. Preliminary experiments in an artificially constructed tank environment as well as simulated missions over a 10mĂ—\times10m coral reef using real data show the tunability of the maps to different bandwidth constraints and science interests. To our knowledge this is the first paper to quantify how the free parameters of the unsupervised scene model impact both the scientific utility of and bandwidth required to communicate the resulting scene model.Comment: 8 pages, 6 figures, accepted for presentation in IEEE Int. Conf. on Robotics and Automation, ICRA '19, Montreal, Canada, May 201

    Online Learning with Optimism and Delay

    Full text link
    Inspired by the demands of real-time climate and weather forecasting, we develop optimistic online learning algorithms that require no parameter tuning and have optimal regret guarantees under delayed feedback. Our algorithms -- DORM, DORM+, and AdaHedgeD -- arise from a novel reduction of delayed online learning to optimistic online learning that reveals how optimistic hints can mitigate the regret penalty caused by delay. We pair this delay-as-optimism perspective with a new analysis of optimistic learning that exposes its robustness to hinting errors and a new meta-algorithm for learning effective hinting strategies in the presence of delay. We conclude by benchmarking our algorithms on four subseasonal climate forecasting tasks, demonstrating low regret relative to state-of-the-art forecasting models.Comment: ICML 2021. 9 pages of main paper and 26 pages of appendix tex

    Quantifying the swimming gaits of veined squid (Loligo forbesi) using bio-logging tags

    Get PDF
    Author Posting. © Company of Biologists, 2019. This article is posted here by permission of Company of Biologists for personal use, not for redistribution. The definitive version was published in Journal of Experimental Biology 222 (2019):jeb.198226, doi: 10.1242/jeb.198226.Squid are mobile, diverse, ecologically important marine organisms whose behavior and habitat use can have substantial impacts on ecosystems and fisheries. However, as a consequence in part of the inherent challenges of monitoring squid in their natural marine environment, fine-scale behavioral observations of these free-swimming, soft-bodied animals are rare. Bio-logging tags provide an emerging way to remotely study squid behavior in their natural environments. Here, we applied a novel, high-resolution bio-logging tag (ITAG) to seven veined squid, Loligo forbesii, in a controlled experimental environment to quantify their short-term (24 h) behavioral patterns. Tag accelerometer, magnetometer and pressure data were used to develop automated gait classification algorithms based on overall dynamic body acceleration, and a subset of the events were assessed and confirmed using concurrently collected video data. Finning, flapping and jetting gaits were observed, with the low-acceleration finning gaits detected most often. The animals routinely used a finning gait to ascend (climb) and then glide during descent with fins extended in the tank's water column, a possible strategy to improve swimming efficiency for these negatively buoyant animals. Arms- and mantle-first directional swimming were observed in approximately equal proportions, and the squid were slightly but significantly more active at night. These tag-based observations are novel for squid and indicate a more efficient mode of movement than suggested by some previous observations. The combination of sensing, classification and estimation developed and applied here will enable the quantification of squid activity patterns in the wild to provide new biological information, such as in situ identification of behavioral states, temporal patterns, habitat requirements, energy expenditure and interactions of squid through space–time in the wild.This work was supported by Woods Hole Oceanographic Institution’s Ocean Life Institute and the Innovative Technology Program, Hopkins Marine Station’s Marine Life Observatory (to K.K.), as well as the National Science Foundation Program for Instrument Development for Biological Research (award no. 1455593 to T.A.M., K.K. and K.A.S.). F.C. thanks the Presidentís International Fellowship Initiative (PIFI) of the Chinese Academy of Science. G.E.F. thanks the National Science Foundation GRFP and National Science Foundation REU programs for support of this research.2020-10-2

    Near-optimal irrevocable sample selection for periodic data streams with applications to marine robotics

    No full text
    We consider the task of monitoring spatiotemporal phenomena in real-time by deploying limited sampling resources at locations of interest irrevocably and without knowledge of future observations. This task can be modeled as an instance of the classical secretary problem. Although this problem has been studied extensively in theoretical domains, existing algorithms require that data arrive in random order to provide performance guarantees. These algorithms will perform arbitrarily poorly on data streams such as those encountered in robotics and environmental monitoring domains, which tend to have spatiotemporal structure. We focus on the problem of selecting representative samples from phenomena with periodic structure and introduce a novel sample selection algorithm that recovers a near-optimal sample set according to any monotone submodular utility function. We evaluate our algorithm on a seven-year environmental dataset collected at the Martha’s Vineyard Coastal Observatory and show that it selects phytoplankton sample locations that are nearly optimal in an information-theoretic sense for predicting phytoplankton concentrations in locations that were not directly sampled. The proposed periodic secretary algorithm can be used with theoretical performance guarantees in many real-time sensing and robotics applications for streaming, irrevocable sample selection from periodic data streams
    corecore